Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models

Authors

  • Gideon Mann
  • Ryan T. McDonald
  • Mehryar Mohri
  • Nathan Silberman
  • Dan Walker
Abstract

Training conditional maximum entropy models on massive data sets requires significant computational resources. We examine three common distributed training methods for conditional maxent: a distributed gradient computation method, a majority vote method, and a mixture weight method. We analyze and compare the CPU and network time complexity of each of these methods and present a theoretical analysis of conditional maxent models, including a study of the convergence of the mixture weight method, the most resource-efficient technique. We also report the results of large-scale experiments comparing these three methods, which demonstrate the benefits of the mixture weight method: it consumes fewer resources while achieving performance comparable to that of standard approaches.
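
To make the mixture weight method concrete, here is a minimal sketch, assuming details the abstract does not give: uniform mixture weights, plain batch gradient descent on the L2-regularized log loss, and dense NumPy arrays. The names train_shard and mixture_of_weights are ours, not the paper's.

    import numpy as np

    def train_shard(X, y, n_classes, l2=1e-3, lr=0.1, epochs=100):
        """Train a conditional maxent (multinomial logistic) model on one
        shard by gradient descent on the regularized negative log-likelihood.
        y is an integer array of class indices."""
        n, d = X.shape
        W = np.zeros((n_classes, d))
        for _ in range(epochs):
            scores = X @ W.T                           # (n, n_classes)
            scores -= scores.max(axis=1, keepdims=True)
            P = np.exp(scores)
            P /= P.sum(axis=1, keepdims=True)          # p(y | x; W)
            grad = (P - np.eye(n_classes)[y]).T @ X / n + l2 * W
            W -= lr * grad
        return W

    def mixture_of_weights(shards, n_classes, mix=None):
        """Mixture weight method: train each shard independently, then
        return a weighted average of the per-shard parameter matrices."""
        models = [train_shard(X, y, n_classes) for X, y in shards]
        if mix is None:
            mix = np.full(len(models), 1.0 / len(models))
        return sum(m * W for m, W in zip(mix, models))

Unlike distributed gradient computation, which exchanges a gradient over the network at every iteration, this method communicates each worker's parameters exactly once, which is the resource saving the abstract refers to.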


Similar articles

Efficient sampling and feature selection in whole sentence maximum entropy language models

Conditional Maximum Entropy models have been successfully applied to estimating language model probabilities of the form p(w|h), but are often too demanding computationally. Furthermore, the conditional framework does not lend itself to expressing global sentential phenomena. We have recently introduced a non-conditional Maximum Entropy language model which directly models the probability of an entir...
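
For reference, the conditional maxent form this refers to is the standard log-linear model over a word w given its history h; a sketch in generic notation (the feature functions f_i and weights \lambda_i are ours, not taken from the paper):

    p(w \mid h) = \frac{\exp\big(\sum_i \lambda_i f_i(h, w)\big)}
                       {\sum_{w'} \exp\big(\sum_i \lambda_i f_i(h, w')\big)}

The denominator sums over the entire vocabulary for every history, which is the computational burden the snippet alludes to.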


Closed-Form Training of Conditional Random Fields for Large Scale Image Segmentation

We present LS-CRF, a new method for very efficient large-scale training of Conditional Random Fields (CRFs). It is inspired by existing closed-form expressions for the maximum likelihood parameters of a generative graphical model with tree topology. LS-CRF training requires only solving a set of independent regression problems, for which closed-form expression as well as efficient iterative sol...
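
The snippet's key claim is that LS-CRF training reduces to independent regression subproblems with closed-form solutions. As a generic illustration of one such subproblem (not the LS-CRF algorithm itself, whose regression targets the truncated text does not give), a ridge regression solved in closed form:

    import numpy as np

    def ridge_closed_form(X, Y, l2=1e-3):
        """One independent regression subproblem solved in closed form:
        W = (X^T X + l2 I)^{-1} X^T Y. A set of such problems has no
        coupling between its members and can be solved in parallel."""
        d = X.shape[1]
        return np.linalg.solve(X.T @ X + l2 * np.eye(d), X.T @ Y)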


Computationally Efficient M-Estimation of Log-Linear Structure Models

We describe a new loss function, due to Jeon and Lin (2006), for estimating structured log-linear models on arbitrary features. The loss function can be seen as a (generative) alternative to maximum likelihood estimation with an interesting information-theoretic interpretation, and it is statistically consistent. It is substantially faster than maximum (conditional) likelihood estimation of con...


Minimum Entropy Estimation of Hierarchical Random Graph Parameters for Character Recognition

In this paper, we propose a new parameter estimation method called minimum entropy estimation (MEE), which tries to minimize the conditional entropy of the models given the input data. Since MEE makes no assumption about the correctness of the models' parameter space, it will perform no worse than other estimation methods such as maximum likelihood ...
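
The quantity being minimized can be written down directly; the sketch below computes the average conditional entropy of a generic probabilistic classifier's posteriors (our illustration, not the paper's hierarchical random graph formulation):

    import numpy as np

    def conditional_entropy(P, eps=1e-12):
        """Average conditional entropy H(Y|X) of model posteriors P, where
        P[i, y] = p(y | x_i). Minimum entropy estimation drives this toward
        zero, i.e. toward confident predictions on the input data."""
        return -np.mean(np.sum(P * np.log(P + eps), axis=1))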


Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms

We describe new algorithms for training tagging models, as an alternative to maximum-entropy models or conditional random fields (CRFs). The algorithms rely on Viterbi decoding of training examples, combined with simple additive updates. We describe theory justifying the algorithms through a modification of the proof of convergence of the perceptron algorithm for classification problems. We giv...
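
The training loop described here, Viterbi decoding of each training example combined with simple additive updates, is the structured perceptron; below is a minimal averaged-perceptron sketch, with a generic decode function and feature map assumed rather than a concrete tagger:

    import numpy as np

    def perceptron_train(data, feats, decode, dim, epochs=5):
        """Structured perceptron for tagging: decode each sentence with the
        current weights and, on a mistake, add the gold features and subtract
        the predicted ones. feats(x, y) maps a (sentence, tag sequence) pair
        to a feature vector; decode(x, w) returns the Viterbi-best tags."""
        w = np.zeros(dim)
        total, steps = np.zeros(dim), 0
        for _ in range(epochs):
            for x, y_gold in data:
                y_hat = decode(x, w)
                if list(y_hat) != list(y_gold):
                    w += feats(x, y_gold) - feats(x, y_hat)
                total += w              # accumulate for averaging
                steps += 1
        return total / steps            # averaged weights generalize better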




Publication year: 2009